SCUMBLE: a method for systematic and accurate detection of codon usage bias by maximum likelihood estimation
Abstract
The genetic code is degenerate: most amino acids can be encoded by anywhere from two to six different codons. Synonymous codons are not used with equal frequency: some codons are favored over others, and their usage can vary significantly from species to species and between different genes in the same organism. Known causes of codon bias include differences in mutation rates as well as selection pressure related to the expression level of a gene, but the standard analysis methods can account for only a fraction of the observed codon usage variation. Here we introduce an explicit model of codon usage bias, inspired by statistical physics. Combining this model with a maximum likelihood approach, we are able to clearly identify different sources of bias in various genomes. We have applied the algorithm to Saccharomyces cerevisiae as well as 325 prokaryote genomes, and in most cases our model explains essentially all observed variance.
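To make the idea of unequal synonymous codon usage concrete, the following is a minimal sketch, not the SCUMBLE model itself, which the abstract does not specify. It counts how often each of the six leucine codons appears in a coding sequence and returns their relative frequencies; under a plain multinomial model (an assumption made here for illustration) these frequencies are the maximum likelihood estimates of the codon probabilities. The function name `synonymous_frequencies` and the toy sequence are hypothetical.

```python
from collections import Counter

# Synonymous codon family for leucine in the standard genetic code (six codons).
LEUCINE_CODONS = ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG")


def synonymous_frequencies(cds, family=LEUCINE_CODONS):
    """Return the observed frequency of each codon in `family` within `cds`.

    Under a plain multinomial model (an illustrative assumption), these
    frequencies are the maximum likelihood estimates of codon probabilities.
    """
    cds = cds.upper()
    # Split the in-frame sequence into codons, dropping any trailing partial codon.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = Counter(c for c in codons if c in family)
    total = sum(counts.values())
    if total == 0:
        # No codon from this family occurs, so no frequencies can be estimated.
        return {c: 0.0 for c in family}
    return {c: counts[c] / total for c in family}


# Hypothetical toy sequence: ATG CTG CTG TTA CTG CTC TAA
example = "ATGCTGCTGTTACTGCTCTAA"
freqs = synonymous_frequencies(example)
```

A uniform result (each codon near 1/6) would indicate no bias within the family; strong departures from uniformity are the kind of signal that codon usage analyses try to attribute to mutation or selection.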
Similar resources
Maximum likelihood estimation of ancestral codon usage bias parameters in Drosophila.
We present a likelihood method for estimating codon usage bias parameters along the lineages of a phylogeny. The method is an extension of the classical codon-based models used for estimating dN/dS ratios along the lineages of a phylogeny. However, we add one extra parameter for each lineage: the selection coefficient for optimal codon usage (S), allowing joint maximum likelihood estimation of ...
Identification of Synonymous Codon Usage Bias in the Pseudorabies Virus UL31 Gene
Background: Little knowledge of synonymous codon usage pattern of pseudorabies virus (PRV) genome, especially the UL31 gene in the process for its evolution is available. Objectives: In the present study, the codon usage bias between PRV UL31 sequence and the UL31-like sequences was identified. Materials and Methods: We used a comprehensive analysi...
Bearing Fault Detection Based on Maximum Likelihood Estimation and Optimized ANN Using the Bees Algorithm
Rotating machinery is the most common machinery in industry. The root of the faults in rotating machinery is often faulty rolling element bearings. This paper presents a technique using optimized artificial neural network by the Bees Algorithm for automated diagnosis of localized faults in rolling element bearings. The inputs of this technique are a number of features (maximum likelihood estima...
Estimation of Parameters for an Extended Generalized Half Logistic Distribution Based on Complete and Censored Data
This paper considers an Extended Generalized Half Logistic distribution. We derive some properties of this distribution and then we discuss estimation of the distribution parameters by the methods of moments, maximum likelihood and the new method of minimum spacing distance estimator based on complete data. Also, maximum likelihood equations for estimating the parameters based on Type-I and Typ...
Rates of nucleotide substitution and mammalian nuclear gene evolution. Approximate and maximum-likelihood methods lead to different conclusions.
Rates and patterns of synonymous and nonsynonymous substitutions have important implications for the origin and maintenance of mammalian isochores and the effectiveness of selection at synonymous sites. Previous studies of mammalian nuclear genes largely employed approximate methods to estimate rates of nonsynonymous and synonymous substitutions. Because these methods did not account for major ...